Combining Phonology and Morphology for the Normalization of Historical Texts

نویسندگان

  • Izaskun Etxeberria
  • Iñaki Alegria
  • Larraitz Uria
  • Mans Hulden
چکیده

This paper presents a proposal for the normalization of word-forms in historical texts. To perform this task, we extend our previous research on induction of phonology and adapt it to the task of normalization. In particular, we combine our earlier models with models for learning morphology (without additional supervision). The results are mixed: induction of the segmentation of morphemes fails to directly offer significant improvements while including known morpheme boundaries in standard texts do improve results.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

EHU at the SIGMORPHON 2016 Shared Task. A Simple Proposal: Grapheme-to-Phoneme for Inflection

This paper presents a proposal for learning morphological inflections by a graphemeto-phoneme learning model. No special processing is used for specific languages. The starting point has been our previous research on induction of phonology and morphology for normalization of historical texts. The results show that a very simple method can indeed improve upon some baselines, but does not reach t...

متن کامل

Computing and Historical Phonology Proceedings of the Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology

We introduce the proceedings from the workshop ‘Computing and Historical Phonology: 9th Meeting of the ACL Special Interest Group for Computational Morphology and Phonology’.

متن کامل

Computing and Historical Phonology

We introduce the proceedings of the workshop ‘Computing and Historical Phonology: 9th Meeting of ACL Special Interest Group for Computational Morphology and Phonology’.

متن کامل

Welcome to the ACL Workshop on Computing and Historical Phonology, the 9th Meeting of ACL Special Interest Group for Computational Morphology and Phonology, a meeting held in conjunction with the 45th Meeting of the ACL

We introduce the proceedings from the workshop ‘Computing and Historical Phonology: 9th Meeting of the ACL Special Interest Group for Computational Morphology and Phonology’.

متن کامل

Normalizing Medieval German Texts: from rules to deep learning

The application of NLP tools to historical texts is complicated by a high level of spelling variation. Different methods of historical text normalization have been proposed. In this comparative evaluation I test the following three approaches to text canonicalization on historical German texts from 15th–16th centuries: rule-based, statistical machine translation, and neural machine translation....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016